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Abstracts A preliminary routine for solving large-order linear 
systems having sparse matrices is described. An 
iterative approach is used in order to take advantage 
of the absence of most matrix elements. Such a routine 
could handle a 100 x 100 matrix of this character in 
core storage. 



!• Introduction 

in many problems, such as elliptic partial differential equa- 
tions or network analyses, large sets of simultaneous- linear equations 
are encotintered which are characterized by a preponderance of zero ele- 
ments off the main diagonal of the matrix. These matrices are in addi- 
tion symmetric and loaded* on the diagonal. In order to capitalize on 
these properties, the Jacobi iteration described in section 3 was se- 
lected as the basis for a Whirlwind routine. For large systems, the 
iterative Approach to solution has two advantages i In the first place, 
only enough storage registers joeed be provided to take care of the non- 
zero elements in the matrix, whereas reductions sudh as Gauss-Jordan or 
Grout can not so profit by the character of the matrix. Secondly, ini- 
tial data ai^e used in each stage of the iteration, obviating the neces- 
sity of carrying large numbers of superfluous digits in or^r to over- 
come round-off. One can expect, though, that the iteration will be very 
slow in convergence for large systems; and this factor must be weighed 
against the advantages. 

2. Description of the Program, fc TAPE 141-94-107 

2.1 Input Data 

2.1.1 > Preset Parameters 

zml » n order of matrix 

zm£ » a number of non-zero superdi agonal 

elements 



» -Loaded- ^ans ^, pijli ^1 for all 1, 



3-1 
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zffiS ?* location of 1st register of data 
zm4 « location of 1st ten5)orary register 
zffiS « zml+zml not assigned by programmer 
2*1.2 Data Storage 

Data are stored seq.nentially in the following order 



beginning with register zmS: 



b 



n double registers containing right-hand side 

P, any distinctive one-register number, preferably not zero, 
to be used for checking purposes 

e) index to superdi agonal elements (see 2.2) 

d) P, same as b 

e) a double registers listing superdiagonal elements row-wise 

f ) Pj same as b 

g) n double registers containing diagonal elements 
h) P, same as b 

2.2 Index 

In order to identify the proper location of the listed 
superdiagonal elements in the matrix, an index is constructed. M index 
group of registers is included for each row of the matrix. The group 
must contain a number of binary digits (excluding sign digits) greater 
than or equal to the number of superdiagonal elements in the row. 13ius 
for the ith row the index group will contain r registers where 

n-i ^ 15r 4 n-i ■♦• 14, i«l, 2,..«n-l 

except for i=n, where r is taken to be 1. 

The succeeding digits of the group are 1 or according as 
the corresponding superdiagonal element of the matrix is non-zero or zero. 
The signs of the registers are positive. 



Example : 



Row one of a 40 x 40 matrix has non-zero elements in columns 
10, 20, 30, 40. The index group for this row is 



0.00100 
0.04002 
0.00100 



(octal) 



The index groups are listed in order row- 
^^^^^^ ^ Matrix ^ 

J 1 J J 

ii 
i 1 

i i 1 



r-row. 



Index 
0.54000 

0.50000 

0.00000 

0.00000 

0. 00000 
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Note that the last index register always contains zero. 

The number of index registers may be determined by the 
formula 

1 ♦ (Qtl)(R+15Q/2) 

where Q and R are the quotient and remainder in the division (n-l)/l5, 

2.3 Length of Routine 

Instructions; 238 

Temporary registers; 6n registers starting at zm 4 

Data storage; 4n + 2a + (Q+l) (R+15Q/2) + 5 registers 

2.4 Set-up and Checking 

Built into the routine is a cheeking facility to guard 
against gross coding errors in the input data. Before the first entry 
to the iteration, the checking procedure must be performed. It is 
entered by 

sp e8 entry for checking 

Check alarms are generated for any of several errors; 

Check alarm in register Cause 

lei Incorrect number of elements in 

vector b, or incorrect number of 
index registers 

2el Number of off -diagonal elements 

listed disagrees with zm2 

3el Incorrect number of diagonal ele- 

ments 

e5-2 Number of non-zero index bits dis- 

agrees with zm2 

2*5 Operation and Results 

There are two entries for calculating with the routine. 
The first time the routine is used with new data it must be entered by 

sp e? entry with new data 

For further iterations it is entered by 

sp b4 entry for iterations 



^3k+2 " 


^3k+l "^'^Sk+l 


^3k+2 


k « 0,1,2,... 


^3k+3 
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At the end of the kth iteration, the latest results are available in the 
temporary registers starting with zm 4: 

2n registers contain 

2n registers contain 

2n registers contain 

A quick measure of accuracy at this stage is given by the contents of 
register gl. This quantity is the exponent of the greatest Aitken cor- 
rection made during the iteration. The routine provides, as a guard 
against instability, that this quantity does not increase. This i6 done 
by taking 

^3k+3 " ^3k*2 

if the criterion is not met. Thus, for a coB5)lete check of precision the 
quantities Ax„^- might be examined. One must multiply these by some- 
thing like a factor of 5 (this is empirical) to get an estimate of accura- 
cy. 

2»6 Time and Space Comparisons 

In its present farm, the routine requires about 22 ms per 
superdiagonal element plus 28 ms per row for each complete Aitken-Jacobi 
cycle. Since the nximber of nonvanishing elements in matrices of the type 
involved here is practically proportional to the order of the matrix, the 
length of an iteration is proportional to n. The number of iterations 
required depends on the character of the matrix involved and the accura- 
cy desired, but a fair guess is that (for a given accuracy) this number 
is proportional to n, giving a total operating time proportional to n|, 
compared to the time for reduction schemes which is proportional to n . 
Input time is of course reduced using the iterative scheme. 

For 4-decimal accuracy, in a matrix with 4 off -diagonal 
elements per row, limited experience leads one to expect the iterative 
method to take about 75n^ ms while Grout's method (on numbers in core 
memory) takes about Sn*^. Consequently, we expect no profit to be gained 
by iteration on matrices of order less than 25. It h^pens that this is 
just about the maximum matrix which can be handled by the present Crout 
reduction routine. The iterative procedure steps in just at the point 
where storage limitations cause the Crout method to break down. 

Storage requirements are small for the iterative routine. 
Instead of 2n(n+2) registers being required for data storage as in the 
Grout routine, only 8n registers are necessary for various vectors plus 
about n^/'SO index registers and the registers for the non-zero off- 
diagonal elements. A 100 x 100 symmetrix matrix with 400 non-zero off- 
diagonal elements would readily fit into core storage. 

3. Theory 

In matrix notation, the routine solves 
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(1) Ax » b 
by the iteration 

(2) x^^^ « b - (A-I)x^ 

for which the error at the i+lst step is given by 

(3) e^^^ • (I-.A)ej^ 

vrtiere e^ = x^-i and x is the solution of (l). 

By the difference equation (3) we see that the iteration converges if 

(4) U- X. Ul j « 1,2.. .n 

where A. are the eigenvalues of A. In general, this criterion is satis- 

fied if the rows of a loaded Hiatrix are normalized by dividing through 
by the diagonal elements.-'- In the case of electric networks with real 
impedance or of elliptic partial differential equations, the require- 
ment is met, for otherwise a nontrivial solution of the problem 

Ax » 

with vanishing boundary values would exist. 

Apparently convergence is good if the eigenvalues of the nor- 
malized problem are clustered about imity, but poor if there are eigen- 
values near either end of the interval (0,2). A further difficulty of 
the process is that it is first-order, with e. decreasing by a constant 
factor at each stage. Roughly speaking, the process may be made second 
order by using Aitken's ^ process-^„on the components of x. according to 



^^^ ^k+3 



X, 



3k 2 



^ ^3k 



where the symbolic division means a component-by-component operation. 
Unfortunately, it is possible for the denominator in (5) to vanish while 
the numerator remains different from zero, so the Aitksen correction can 
lead to instability. Consequently, in the routine, provision is made to 
limit the magnitude of this correction by taking 

^3k+3 "* ^3k+2 
in cases where the Ait ken correction would be large. 



1. A. S. Householder, Principles of Numerical Analysis , New York, 1953, 
p. 148, p. 126. 
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4, Possible Improvements 

In cases where four or less decimal digits are required, great 
advantage could be obtained by using single length arithmetic (perhaps 
dispensing with the Aitken modification, which would require safeguards 
for the division) « 

For use with elliptic equations, where all the off -diagonal 
elements are equal, the listing and looking up of these elements should 
be eliminated, resulting in a general relaxation routine. Also, since 
all the diagonal elements are equal, the normalizing feature should be 
dropped and the matrix should be put into normalized form before the 
calculation. With these two modifications, a total time for 4-place 
accuracy in solving elliptic equations in two dimensions might be re- 
duced to something like 

2 
lOn ms 

with storage requirements about 2n + n /SO. A 200 x 200 matrix could be 
fit in core storage, but the time it would take is unknown; for an ex- 
trapolation to n ■■ 200 of the time estimates based on experience with 
n ■ 20 is of dubious value. 

In the case of larger matrices, the indexing scheme used here 
may be replaced by a still simpler one with a resulting improvement in 
utilization of storage. This method is to actually store the indices, 
1 and j, for each non-vanlahlng element. These Indices could be packed 
in one Whirlwind register if n^28. When n^SO exceeds the number of 
superdlagonal elements, this procedure leads to a saving in data storage 
space; It also would provide a simpler program. 

The method described here could be used for non-symmetric ma- 
trices; but would probably not be useful, for nonsymmetric problems with 
sparse matrices are rare, and the advantages of iteration arise from 
this particular characteristic. 



Signed 



M. Doughs Mcllroy // 



